From Glossaries to Ontologies: Extracting Semantic Structure from Textual Definitions
Abstract
Learning ontologies requires the acquisition of relevant domain concepts and of taxonomic as well as non-taxonomic relations. In this chapter, we present a methodology for automatic ontology enrichment and document annotation with the concepts and relations of an existing domain core ontology. Natural language definitions from available glossaries in a given domain are processed, and regular expressions are applied to identify general-purpose and domain-specific relations. We evaluate the methodology's performance in extracting hypernymy and non-taxonomic relations. To this end, we annotated and formalized a relevant fragment of the glossary of Art and Architecture (AAT) with a set of 10 relations (plus the hypernymy relation) defined in the CRM CIDOC cultural heritage core ontology, a recent W3C standard. Finally, we assessed the generality of the approach on a set of web pages from the domains of history and biography.
Related resources
Presenting a method for extracting structured domain-dependent information from Farsi Web pages
Extracting structured information about entities from web texts is an important task in web mining, natural language processing, and information extraction. Information extraction is useful in many applications including search engines, question-answering systems, recommender systems, machine translation, etc. An information extraction system aims to identify the entities from the text and extr...
Weakly Supervised Definition Extraction
Definition Extraction (DE) is the task of extracting textual definitions from naturally occurring text. It is gaining popularity as a prior step for constructing taxonomies, ontologies, automatic glossaries or dictionary entries. These fields of application motivate greater interest in well-formed encyclopedic text from which to extract definitions, and therefore DE for academic or lay discourse h...
Framework Baseado em Ontologias para Publicação e Integração Semântica de Glossários (An Ontology-based Framework for Publishing, Semantic Integration of Glossaries)
Glossaries play a central role in improving semantic search services and aligning ontologies from heterogeneous data sources. This work deals with the semantic interoperability problem between glossaries and presents a framework based on ontologies for semantic integration of glossaries. The proposed framework is used to construct a mashup that aims to integrate terminologies of different glos...
Definition Coverage in the OBO Foundry Ontologies: The Big Picture
High quality ontologies have both textual and logical definitions for their terms. Definitions serve many purposes: good textual definitions allow for experts and non-experts alike to understand the content of an ontology and use it in the manner the authors intended; logical definitions are necessary for reasoners to verify that an ontology is consistent, and may make application of the ontolo...
Formalizing biomedical concepts from textual definitions
BACKGROUND: Ontologies play a major role in life sciences, enabling a number of applications, from new data integration to knowledge verification. SNOMED CT is a large medical ontology that is formally defined so that it ensures global consistency and support of complex reasoning tasks. Most biomedical ontologies and taxonomies on the other hand define concepts only textually, without the use of...